Greg Detre
@14:30 Tuesday, October 08, 2002
with Federica Busa
why did GL make certain choices? how does it differ from WN?
what do we mean by the "lexicon"?
Push: is that the same as the question of what we consider linguistic
and non-linguistic?
GL has made the choice of looking at sentences to help make the generalisations
you can't say "a good cloud" as easily as "a good chair" - presumably,
Pustejovsky would say this is because "cloud" doesn't carry the telic
information that the modifier "good" needs to attach to - that is, in
order for "good cloud" to make sense, you have to construct a virtual cloud
type within a context, right???
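(a minimal sketch of the "good chair" vs "good cloud" point, assuming the usual four qualia roles: formal, constitutive, telic, agentive. The class `Qualia`, the toy entries in `LEXICON` and the function `interpret_good` are all hypothetical names made up for illustration, not part of any GL implementation.)

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Qualia:
    """Toy qualia structure: the four roles, each optional."""
    formal: Optional[str] = None        # what kind of thing it is
    constitutive: Optional[str] = None  # what it is made of
    telic: Optional[str] = None         # what it is for
    agentive: Optional[str] = None      # how it comes about


# Hypothetical lexical entries, for illustration only.
LEXICON = {
    "chair": Qualia(formal="furniture", telic="sit on", agentive="make"),
    "cloud": Qualia(formal="natural object", constitutive="water vapour"),
}


def interpret_good(noun: str, context_telic: Optional[str] = None) -> Optional[str]:
    """Return a reading for 'good <noun>', or None if no telic role is available.

    'good' is treated as a modifier of the telic role. If the noun has none,
    the context may supply one (the 'virtual type' idea above).
    """
    telic = LEXICON[noun].telic or context_telic
    if telic is None:
        return None
    return f"{noun} that is good to {telic}"


print(interpret_good("chair"))           # telic comes from the lexicon
print(interpret_good("cloud"))           # no telic, so no reading
print(interpret_good("cloud", "paint"))  # telic supplied by a context
```

the design choice here is that the lexical entry for "cloud" is left untouched and the telic is passed in from outside, which is one way of reading "construct a virtual type within a context".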
Deb: are we trying to create a type system of discrete types based on a
softer/continuous underlying system?
surely not, because the creation of virtual types is dynamic - in the
same way that "audience" is a word that comes into being on the fly
actually, she thinks the case of "audience" is a bit different, because
an audience's persistence is based on its referents, whereas a cloud's
referents are permanent (whether or not I'm there)
Deb: can things be a little bit complex?
what is complex???
= "(dot objects) to model objects with multiple and interdependent
denotations" (Pustejovsky)
in WordNet:
synset - one semantic node with many lexical items
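(a rough sketch of "one semantic node with many lexical items", assuming nothing about WordNet's actual file format or API. `Synset`, `build_index` and the two toy entries are hypothetical and not copied from the WordNet database.)

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Synset:
    """One semantic node: a gloss plus the lexical items that express it."""
    gloss: str
    lemmas: List[str] = field(default_factory=list)


def build_index(synsets: List[Synset]) -> Dict[str, List[int]]:
    """Map each lemma to the positions of the synsets it belongs to."""
    index: Dict[str, List[int]] = {}
    for i, synset in enumerate(synsets):
        for lemma in synset.lemmas:
            index.setdefault(lemma, []).append(i)
    return index


# Toy data, for illustration only.
SYNSETS = [
    Synset("a motor vehicle with four wheels", ["car", "auto", "automobile"]),
    Synset("a wheeled vehicle adapted to rails", ["car", "railcar"]),
]

INDEX = build_index(SYNSETS)
print(INDEX["car"])         # an ambiguous word points at several nodes
print(INDEX["automobile"])  # an unambiguous word points at one
```

so word senses are enumerated as separate nodes here, which is the contrast with GL generating readings from one structured entry.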
qualia structure - internal syntax for structuring concepts
they admit that their particular structure is contingent (so how do
they defend it???)
Deb: mental vs ideal?
you'd have to ask Pustejovsky - she's not convinced about that
functional types - can be recursively generated, by embedding a simple
type in larger numbers of qualia
Deb: is there anywhere to encode temporal or modal logic?
GL says it's not the job of the lexicon to tell you which
interpretation, only to generate the possibilities
SIMPLE was the project of defining the language-independent knowledge representation for 12 European languages
demonstrates how GL would differentiate between all the different senses
of a word using the different qualia parameters
Deb: the vast majority of words that kids use early on don't fit into the GL tripartite (events/objects/qualities) schema
other words that GL has trouble with: yes/no, more, I, and/or, quantifiers etc.
Miguel: perhaps they're in the grammar... or in the internal structure of the knowledge base - they're operational. they're closed.
what does it mean to say that they're "closed"???
apparently the pronouns are a closed grammatical system
difficult to say whether prepositions are closed or open?
in GL, a lot of these would be operators - Federica doesn't think you'd want these in your type system
e.g. you'd fit "want" into the event hierarchy (with telic and other qualia bits and bobs(???))
type composition
the "generative" in GL arises from the compositionality of the qualia relations
Deb: if there are only 10ish operators, does that mean that there are only 10 possible interpretations per word? no, there are as many as you have parameters...(???)
Push: isn't this how linguistic analysis has always been done?
at least, in the AI community, this method of concept-formation is not new
Federica thinks it's certainly an improvement on the huge enumerations you get in the linguistic community
she says she's not aware of any system that's done this on a very large scale in the AI community
Peter: how did you test coverage?
they had an algorithm which went through huge amounts of text and tried to pull out everything that it thought was a compound - then they would look over it afterwards
example: they were working on a corpus based on travel books - dynamically generated categories, e.g. "French villages", "river villages" from "charming towns in Europe"
the rules for compounding and for noun-preposition-noun phrases are exactly the same, e.g. "towns in Europe"
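(a naive sketch of treating compounds and noun-preposition-noun phrases with the same rule: both are reduced to a (head, modifier) pair. The real algorithm isn't described in these notes, so everything here is an assumption: `PREPOSITIONS`, `head_modifier` and `group_by_head` are hypothetical names, and plain token patterns stand in for whatever tagging or parsing the actual system used.)

```python
from typing import Dict, List, Optional, Set, Tuple

# Small illustrative set; a real system would need a fuller list.
PREPOSITIONS = {"in", "of", "on", "by", "near"}


def head_modifier(phrase: str) -> Optional[Tuple[str, str]]:
    """Reduce 'river villages' or 'towns in Europe' to (head, modifier)."""
    tokens = phrase.split()
    if len(tokens) == 3 and tokens[1] in PREPOSITIONS:
        # noun-preposition-noun: the head comes first
        return tokens[0], tokens[2]
    if len(tokens) == 2:
        # compound: the head comes last
        return tokens[1], tokens[0]
    return None  # anything else is outside this toy rule


def group_by_head(phrases: List[str]) -> Dict[str, Set[str]]:
    """Collect the modifiers seen with each head, as ad hoc categories."""
    groups: Dict[str, Set[str]] = {}
    for phrase in phrases:
        pair = head_modifier(phrase)
        if pair is not None:
            head, modifier = pair
            groups.setdefault(head, set()).add(modifier)
    return groups


GROUPS = group_by_head(["French villages", "river villages", "towns in Europe"])
print(GROUPS)
```

anything the pattern can't handle is simply dropped, which roughly mirrors the "look over it afterwards" step: the output is a candidate list to be checked by hand, not a finished set of categories.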
started with a corpus of 200-300,000 book descriptions - it went wrong when that went up to 2.5m
that included an awful lot of gumph though that you'd want to filter out first
is this intended to fully describe a concept, or simply distinguish it??? I think it's intended as a full description - but GL amounts to little more than word association without meaning - same problem as Quillian's semantic nets
Deb: this is the same as the original question, "what needs to be in a lexicon?"
she doesn't think that the lexicon entry "book" needs much more than that it's telic, and has some genre information etc. - that's all you need for your type system - then you need the composition rule, detailing the operations you perform on "book" in terms of deployment - she thinks as a linguist, rather than in terms of information retrieval - the lexicon provides the prototypes
how you apply the four levels of representation to discourse models...(???)
important: given the state of current language technology, you need to have a well-bounded problem
Push: it's easy to see other applications for which this won't be enough, e.g. "I want a book for my four year old daughter" - that won't be in the lexicon
top ontology for EuroWordNet
1stOrderEntity/2ndOrderEntity
I think she said it's in line with the GL, in that it has qualia-like structuring at the top level (function/composition/origin/form)
Deb: the 1st vs 2nd order split in EuroWordNet is quite essential
she says that their 2nd order hierarchy is really the GL event hierarchy
in Lexeme: had the prepositions fairly high up
EuroWordNet - pretty much completed, she thinks
Deb: does that mean it's dead?
it might just be the way that European funding works, in terms of finite/fixed-term deliverables
- metatags for the synset
original question: what should be in the lexicon?
knowledge that has syntactic consequences...???
Deb: perhaps you can't draw a line between what's lexical and what's not
so, the lexicon is everything, in that it does know what sort of books to buy for a 4-year-old
very idiosyncratic set of knowledge + experience, but you still wouldn't want to exclude any of it from what he'd view as the lexicon
it certainly seems implausible to me that you can draw any real/meaningful/clean distinction in the brain between lexicon etc.
to what extent do I think that grammar should be contained in the lexicon???
Push: can't do meaning purely compositionally, e.g. "difficult"
I would want to define/rest the concept of "difficult" upon effort, which is a physical sensation... right???
in order to try and infer qualia from some corpus, you have to start from knowing something
apparently, e.g. prepositions are very useful to know in English
Deb: asked Push about the OpenMind data collection - how do you categorise it? do the categories correspond to the GL categories?
they went in with a certain preconceived idea of what they'd want to know
Hugo is working on turning the OpenMind database into a WordNet-like lexical database
what would be really cool would be to have some means of converting the OpenMind database into a lexical database dynamically according to your chosen parameters for the day, so you could have a 30-type top level, or a binary tree, or whatever you felt like, to see which worked best at answering queries and fitting intuitions, eh???
except, this is more or less the problem of AI :( what does this actually mean/how would you actually go about it???
no class next week (15th October)
then reading the Brooks + Cantwell for 22nd October
future papers - looking at grounding and non-symbolic...
will the presentations be on-line? depends on the individuals who wrote them
what you're doing, how you're doing it, what you're hoping to achieve
prefer individual
can the GL be flexible about having multiple ways of cutting up the world (as opposed to e.g. entities/actions/qualities)???
I'm pretty sure it can't - hmmm, having said that, the only commitments it
really seems to make are down to the (still very high) level of
natural/functional/complex - presumably, SIMPLE made its own ontological
commitments within the framework of GL
temporal or modal logic???
events/objects/qualities vs entities/actions/qualities???
1stOrderEntity vs 2ndOrderEntity in EuroWordNet???
look at Minsky's 16ish top-level nodes (in Society of Mind), similar to Jackendoff's
can we see WordNet (bottom-up) and GL (top-down) as being opposite directions towards extensional categorisation???
how is the OpenMind database stored???